Abstract

We extend the problem of obtaining an estimator for the finite population
mean parameter incorporating complete auxiliary information through calibration
estimation in survey sampling but considering a functional data framework.
The functional calibration sampling weights of the estimator are obtained by 
matching the calibration estimation problem with the maximum entropy on 
the mean principle. In particular, the calibration estimation is viewed as an 
infinite dimensional linear inverse problem following the structure of the maximum 
entropy on the mean approach. We give a precise theoretical setting and estimate 
the functional calibration weights assuming, as prior measures, the centered 
Gaussian and compound Poisson random measures. Additionally, through a
simple simulation study, we show that our functional calibration estimator
is more accurate than the Horvitz-Thompson estimator.

1 Introduction



In survey sampling, the well-known calibration estimation method proposed by Deville
and Sarndal [7] allows one to construct an estimate of the finite population total or mean
of a survey variable by incorporating complete auxiliary information on the study
population in order to improve its efficiency. The main idea of the calibration method
consists in replacing the standard sampling design weights $d_i$ of the unbiased
Horvitz-Thompson estimator (Horvitz and Thompson [17]) by new weights $w_i$ close enough to
the $d_i$'s according to some distance function $\mathcal{D}(w,d)$, while satisfying a linear calibration
equation in which the auxiliary information is taken into account. The estimator based
on these new calibration weights is asymptotically design unbiased and consistent, with
a variance smaller than that of the Horvitz-Thompson estimator.

The idea of calibration has been extended to estimate other finite population 
parameters, such as finite population variances, distribution functions and quantiles. 
See, for instance, Rao et al. [26], Kovacevic [18], Theberge [31], Singh [30], Wu and 
Sitter [35], Wu [34], Harms and Duchesne [15], Rueda et al. [27], Sarndal [28], and 
references therein. Recent developments have also addressed, for example,
(parametric and non-parametric) non-linear relationships between the
survey variable and the set of auxiliary variables for the underlying assisting model, and
a broad class of conceivable calibration constraint functions (Breidt and Opsomer
[1], Wu and Sitter [35], Wu [34], Montanari and Ranalli [21]).

One interesting extension emerges when both the survey and auxiliary variables
are considered as infinite dimensional objects such as random functions. This
generalization is motivated by the fact that, due to improvements in data collection
technologies, large and complex databases are being recorded frequently at very fine
time scales, and these can be regarded as functional datasets. This kind of data is collected in
many scientific fields, such as molecular biology, astronomy, marketing, finance and economics,
among many others. An in-depth overview of functional data analysis can be found in
Ramsay and Silverman [24], Ramsay and Silverman [25] and Horvath and Kokoszka
[16]. Functional versions of the Horvitz-Thompson estimator have been proposed
recently by Cardot and Josserand [2] and Cardot et al. [3] for the cases of error free
and noisy functional data, respectively.

The purpose of the present paper is to extend the problem of obtaining calibration
sampling weights to functional data. This is conducted through a generalization
of the work by Gamboa et al. [11], where the calibration estimation problem, which
is considered as a linear inverse problem following Theberge [31], is matched with
the maximum entropy on the mean approach under a finite dimensional setting. The
maximum entropy on the mean principle applied to our goal focuses on reconstructing
a unique posterior measure $\nu^*$ that maximizes the entropy $S(\nu \,\|\, \upsilon)$ of a feasible
finite measure $\nu$ relative to a given prior measure $\upsilon$, subject to a linear constraint.
Finally, the functional calibration sampling weights are defined as the mathematical
expectation with respect to $\nu^*$ of a random variable with mean equal to the standard
sampling design weights $d_i$. In this paper, we reconstruct $\nu^*$ adopting the random
measure approach by Gzyl and Velasquez [14] under an infinite dimensional context.

The maximum entropy method on the mean was introduced by Navaza [22, 23] to 
solve an inverse problem in crystallography, and has been further investigated, from 
a mathematical point of view, by Gamboa [9], Dacunha-Castelle and Gamboa [6] and 
Gamboa and Gassiat [10]. Complementary references on the approach are Mohammad- 
Djafari [20], Marechal [19], Gzyl [13], Gzyl and Velasquez [14] and Golan and Gzyl 
[12]. Maximum entropy solutions, as an alternative to Tikhonov regularization of
ill-conditioned inverse problems, provide a very simple and natural way to incorporate
constraints on the support and the range of the solution (Gamboa and Gassiat [10]),
and their usefulness has been proven, e.g., in crystallography, seismic tomography and
image reconstruction.

The paper is organized as follows. Sect. 2 presents the calibration estimation
framework for the functional finite population mean. In Sect. 3, the connection between
the calibration and maximum entropy on the mean approaches is established, and the
functional calibration sampling weights are obtained assuming two prior measures. In
Sect. 4, the respective approximations of the functional maximum entropy on the mean
estimators are derived. The performance of the estimator is studied through a simple
simulation study in Sect. 5. Some concluding remarks are given in Sect. 6. Finally,
the proofs of the technical results are gathered in the Appendix.

2 Calibration estimation for the functional finite population mean

Let $U_N = \{1, \ldots, N\}$ be a finite survey population from which a realized sample $a$
is drawn with fixed-size sampling design $p_N(a) = \mathbb{P}(A = a)$. Here $a \in \mathcal{A}$, where
$\mathcal{A}$ is the collection of all subsets $A$ of $U_N$ that contains all possible samples of $n$
different elements randomly drawn from $U_N$ according to a given sampling selection
scheme, and $\mathbb{P}$ is a probability measure on $\mathcal{A}$. The first order inclusion probabilities,
$\pi_i = \mathbb{P}(i \in a) = \sum_{a \in \mathcal{A}(i)} p_N(a)$, where $\mathcal{A}(i)$ represents the set of samples that
contain the $i$th element, are assumed to be strictly positive for all $i \in U_N$. See Sarndal
et al. [29] and Fuller [8] for details about survey sampling.

Associated with the $i$th element in $U_N$ there exists a unique functional random
variable $Y_i(t)$ with values in the space $C([0,T])$ of all continuous real-valued functions
defined on $[0, T]$ with $T < +\infty$. However, only the sample functional data, $Y_i(t)$, $i \in a$,
are observed. Additionally, an auxiliary $q$-dimensional functional vector is available
for each $i \in U_N$, $X_i(t) = (X_{i1}(t), \ldots, X_{iq}(t))^\top \in C([0,T]^q)$ with $q \geq 1$. The known
functional finite population mean is denoted by $\mu_X(t) = N^{-1} \sum_{i \in U_N} X_i(t)$.

The main goal is to obtain a design consistent estimator for the unknown
functional finite population mean, $\mu_Y(t) = N^{-1} \sum_{i \in U_N} Y_i(t)$, based on the calibration
method. The idea consists in replacing the basic sampling design weights, $d_i = \pi_i^{-1}$,
of the unbiased functional Horvitz-Thompson estimator defined by
$\hat{\mu}_Y^{HT}(t) = N^{-1} \sum_{i \in a} d_i Y_i(t)$, by new, more efficient weights $w_i > 0$ incorporating the auxiliary
information. These weights must be sufficiently close to the $d_i$'s according to some
dissimilarity distance function $\mathcal{D}_a(w,d)$ on $\mathbb{R}^n_+$, and satisfy the set of calibration
constraints

$$N^{-1} \sum_{i \in a} w_i X_i(t) = \mu_X(t).$$

The functional estimator for $\mu_Y(t)$ based on the calibration weights is expressed
by the linear weighted estimator $\hat{\mu}_Y(t) = N^{-1} \sum_{i \in a} w_i Y_i(t)$. Different calibration
estimators can be obtained depending on the chosen distance function (Deville and
Sarndal [7]). However, it is well known that, in the finite dimensional setting, all
calibration estimators are asymptotically equivalent to the one obtained through the
use of the popular chi-square distance function $\mathcal{D}_a(w, d) = \sum_{i \in a} (w_i - d_i)^2 / (2 d_i q_i)$, where
$q_i$ is an individual given positive weight uncorrelated with $d_i$.

Assuming a point-wise multiple linear regression model (Ramsay and Silverman
[25]), $Y_i(t) = X_i(t)^\top \beta(t) + \varepsilon_i(t)$, where $\varepsilon_i(t)$ is the $i$th zero-mean measurement
functional error independent of $X_i(t)$ with variance structure given by a diagonal
matrix with elements $1/q_i$ unrelated to $d_i$, the estimator for $\mu_Y(t)$ from the
restricted minimization problem can be expressed as

$$\hat{\mu}_Y(t) = \hat{\mu}_Y^{HT}(t) + \left\{\mu_X(t) - \hat{\mu}_X^{HT}(t)\right\}^\top \hat{\beta}(t),$$

where $\hat{\mu}_X^{HT}(t) = N^{-1} \sum_{i \in a} d_i X_i(t)$ denotes the Horvitz-Thompson estimator for the
functional vector $\mu_X(t)$, and
$\hat{\beta}(t) = \left\{\sum_{i \in a} d_i q_i X_i(t) X_i(t)^\top\right\}^{-1} \sum_{i \in a} d_i q_i X_i(t) Y_i(t)$
is the weighted estimator of the functional coefficient vector $\beta(t)$, whose uniqueness
relies on the existence of the inverse of the matrix $\sum_{i \in a} d_i q_i X_i(t) X_i(t)^\top$ for all $t$.

The calibration weights can be generalized by allowing functional calibration weights
$w_i(t)$, which can be obtained from the minimization of the generalized chi-square
distance $\mathcal{D}_a^*(w,d)$, expressed below, subject to the functional calibration restriction

$$N^{-1} \sum_{i \in a} w_i(t) X_i(t) = \mu_X(t). \qquad (1)$$

The existence of functional calibration weights is stated in the next theorem, which
is a straightforward generalization of the finite dimensional results of Deville and
Sarndal [7].

Theorem 1. Assume the existence of a functional vector $w(t) = (w_1(t), \ldots, w_n(t))^\top$
such that (1) holds, and of the inverse of the matrix $\sum_{i \in a} d_i q_i(t) X_i(t) X_i(t)^\top$. Then, for
a fixed $t \in [0, T]$, $w(t)$ minimizes over $C([0, T]^n)$ the generalized chi-square distance

$$\mathcal{D}_a^*(w, d) = \sum_{i \in a} \frac{\{w_i(t) - d_i\}^2}{2 d_i q_i(t)}$$

subject to (1), where the functional calibration weight $w_i(t)$ for all $i \in a$ is given by

$$w_i(t) = d_i \left[ 1 + q_i(t) \left\{\mu_X(t) - \hat{\mu}_X^{HT}(t)\right\}^\top \left\{ N^{-1} \sum_{i' \in a} d_{i'} q_{i'}(t) X_{i'}(t) X_{i'}(t)^\top \right\}^{-1} X_i(t) \right].$$

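Note that, for this generalized setting, the functional calibration estimator for $\mu_Y(t)$
is expressed by

$$\hat{\mu}_Y(t) = N^{-1} \sum_{i \in a} w_i(t) Y_i(t) = \hat{\mu}_Y^{HT}(t) + \left\{\mu_X(t) - \hat{\mu}_X^{HT}(t)\right\}^\top \hat{\beta}(t),$$

where

$$\hat{\beta}(t) = \left\{ \sum_{i \in a} d_i q_i(t) X_i(t) X_i(t)^\top \right\}^{-1} \sum_{i \in a} d_i q_i(t) X_i(t) Y_i(t),$$

provided the inverse of the matrix $\sum_{i \in a} d_i q_i(t) X_i(t) X_i(t)^\top$ exists for all $t$.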
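
The chi-square calibration weights of Theorem 1 can be checked numerically at each point of a time grid. The following is only a minimal sketch under stated assumptions: a single auxiliary curve ($q = 1$), equal design weights $d_i = N/n$, $q_i(t) = 1$, and a toy population invented for illustration. The function name `chi_square_calibration` and all simulated values are hypothetical and are not taken from the paper.

```python
import math
import random

def chi_square_calibration(d, q, x, mu_x, N):
    """Calibrated weights at one time point t for a single covariate (q = 1)."""
    ht_mean = sum(di * xi for di, xi in zip(d, x)) / N             # HT estimate of mu_X(t)
    m = sum(di * qi * xi * xi for di, qi, xi in zip(d, q, x)) / N  # N^{-1} sum d_i q_i X_i(t)^2
    lam = (mu_x - ht_mean) / m
    return [di * (1.0 + qi * lam * xi) for di, qi, xi in zip(d, q, x)]

# Toy population (assumption): X_i(t) = 2 + u_i + sin(2 pi t), u_i uniform on [-1, 1]
random.seed(1)
N, n = 200, 20
t_grid = [l / 10 for l in range(1, 11)]
shifts = [random.uniform(-1.0, 1.0) for _ in range(N)]
X = [[2.0 + u + math.sin(2.0 * math.pi * t) for t in t_grid] for u in shifts]
mu_X = [sum(X[i][l] for i in range(N)) / N for l in range(len(t_grid))]

sample = random.sample(range(N), n)   # simple random sample without replacement
d = [N / n] * n                       # design weights d_i = 1 / pi_i
q = [1.0] * n

weights = []    # weights[l][k] is w_i(t_l) for the k-th sampled unit
residuals = []  # absolute error in the calibration restriction (1) at each t_l
for l in range(len(t_grid)):
    x_t = [X[i][l] for i in sample]
    w_t = chi_square_calibration(d, q, x_t, mu_X[l], N)
    weights.append(w_t)
    residuals.append(abs(sum(w * x for w, x in zip(w_t, x_t)) / N - mu_X[l]))
max_residual = max(residuals)
```

In this sketch `max_residual` measures how well restriction (1) is met on the grid; it should be zero up to floating point rounding, since the weights are built to satisfy the restriction exactly at each grid point.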

3 Maximum entropy on the mean for survey sampling



Let $(\mathcal{X}, \mathcal{F})$ be an arbitrary measurable space over which we want to search for a
$\sigma$-finite positive measure $\mu$. The maximum entropy on the mean principle provides
an efficient way of getting an estimator for some linear functional $\mu_Y(t) = \int_{\mathcal{X}} Y(t)\, d\mu$,
satisfying a known $q$-vector of functionals $\int_{\mathcal{X}} X(t)\, d\mu = \mu_X(t)$, where
$Y(t): \mathcal{X} \to C([0,T])$ and $X(t): \mathcal{X} \to C([0, T]^q)$.

A natural unbiased and consistent estimator of $\mu_Y(t)$ is the empirical functional
mean $\hat{\mu}_Y(t) = \int_{\mathcal{X}} Y(t)\, d\mu_n = n^{-1} \sum_{i \in a} Y_i(t)$, where $\mu_n = n^{-1} \sum_{i \in a} \delta_{T_i}$ is the
corresponding empirical distribution with $T_1, \ldots, T_n$ an observed random sample from
$\mu$. Despite these properties, this estimator may not have the smallest variance in this
kind of framework. Therefore, by incorporating prior functional auxiliary information, the
variance of an asymptotically unbiased functional estimator can be reduced by applying
the maximum entropy on the mean principle.

The philosophy of the principle consists in enhancing $\hat{\mu}_Y(t)$ by considering the
maximum entropy on the mean functional estimator

$$\hat{\mu}_Y^{MEM}(t) = \int_{\mathcal{X}} Y(t)\, d\mu_n^{MEM} = n^{-1} \sum_{i \in a} p_i(t) Y_i(t), \quad \text{for all } t \in [0,T],$$

where $\mu_n^{MEM} = n^{-1} \sum_{i \in a} p_i(t) \delta_{T_i}$ is a weighted version of the empirical distribution
$\mu_n$, with $p(t) = (p_1(t), \ldots, p_n(t))^\top$ given by the expectation of the independent
$n$-dimensional stochastic process $P(t) = (P_1(t), \ldots, P_n(t))^\top$ drawn from a posterior finite
distribution $\nu^*$, $p(t) = E_{\nu^*}[P(t)]$ for all $t \in [0, T]$, where $\nu^*$ must be close to a prior
distribution $\upsilon$, which transmits the information that $\mu_n^{MEM}$ must be sufficiently
close to $\mu_n$.

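Therefore, the maximum entropy on the mean principle focuses on reconstructing
the posterior measure $\nu^*$ that maximizes the entropy $S(\nu \,\|\, \upsilon) = -D(\nu \,\|\, \upsilon)$ over the
convex set of all probability measures, subject to the linear functional constraint
holding in mean,

$$E_\nu\left[ n^{-1} \sum_{i \in a} P_i(t) X_i(t) \right] = \mu_X(t), \quad \forall t \in [0,T].$$

We recall that $D(\nu \,\|\, \upsilon)$ is the $I$-divergence, or relative divergence, or
Kullback-Leibler information divergence between a feasible finite measure $\nu$ and a
given prior measure $\upsilon$ (see, for instance, Csiszar [4]), defined by

$$D(\nu \,\|\, \upsilon) = \begin{cases} \displaystyle\int_\Omega \log\left(\frac{d\nu}{d\upsilon}\right) d\nu - \nu(\Omega) + 1 & \text{if } \nu \ll \upsilon, \\ +\infty & \text{otherwise.} \end{cases}$$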
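
For measures supported on a common finite set, the divergence above reduces to a finite sum. The sketch below transcribes the definition for that discrete case only; the function name `i_divergence` and the list representation of the measures are illustrative assumptions, not part of the paper.

```python
import math

def i_divergence(nu, upsilon):
    """D(nu || upsilon) for nonnegative measures given as lists of point masses."""
    total = 0.0
    for a, b in zip(nu, upsilon):
        if a > 0.0:
            if b == 0.0:
                # nu puts mass where upsilon has none: nu is not dominated by upsilon
                return math.inf
            total += a * math.log(a / b)   # log(d nu / d upsilon) integrated against nu
    return total - sum(nu) + 1.0           # minus nu(Omega), plus 1

same = i_divergence([0.5, 0.5], [0.5, 0.5])
point_mass = i_divergence([1.0, 0.0], [0.5, 0.5])
not_dominated = i_divergence([0.5, 0.5], [1.0, 0.0])
```

When $\nu$ is a probability measure the correction term $-\nu(\Omega) + 1$ vanishes, so the value coincides with the usual Kullback-Leibler divergence.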



To establish the connection between the calibration and maximum entropy on the
mean approaches, the following notation is adopted: $\tilde{Y}_i(t) = N^{-1} n d_i Y_i(t)$,
$\tilde{X}_i(t) = N^{-1} n d_i X_i(t)$ and $p_i(t) = \pi_i w_i(t)$, such that the functional Horvitz-Thompson
estimator of $\mu_Y(t)$ and the functional calibration constraint (1) can be, respectively,
expressed as

$$\hat{\mu}_Y^{HT}(t) = N^{-1} \sum_{i \in a} d_i Y_i(t) = n^{-1} \sum_{i \in a} \tilde{Y}_i(t)$$

and

$$n^{-1} \sum_{i \in a} p_i(t) \tilde{X}_i(t) = N^{-1} \sum_{i \in a} w_i(t) X_i(t) = \mu_X(t), \quad \forall t \in [0,T],$$

where the first equality of the second display uses $\pi_i d_i = 1$.

Hence, the functional calibration estimation problem follows the structure of the
maximum entropy on the mean principle, where the corresponding estimator is defined
by

$$\hat{\mu}_Y^{MEM}(t) = n^{-1} \sum_{i \in a} p_i(t) \tilde{Y}_i(t) = N^{-1} \sum_{i \in a} w_i(t) Y_i(t).$$

The functional calibration weighting vector $p(t)$ with coordinates $p_i(t) = \pi_i w_i(t)$
for $i \in a$ is the expectation of the $n$-dimensional stochastic process $P(t)$ with
coordinates $P_i(t) = \pi_i W_i(t)$, drawn from $\nu^*$,

$$p(t) = E_{\nu^*}[P(t)], \quad \forall t \in [0,T],$$

where the posterior measure $\nu^* = \otimes_{i \in a} \nu_i^*$ (by the independence of the $P_i$'s) maximizes the
entropy $S(\cdot \,\|\, \upsilon)$ subject to the calibration constraint being fulfilled in mean,

$$E_\nu\left[ n^{-1} \sum_{i \in a} P_i(t) \tilde{X}_i(t) \right] = \mu_X(t), \quad \forall t \in [0,T].$$

Note that, as $P_i(t) = \pi_i W_i(t)$ and $W_i(t)$ must be sufficiently close to $d_i$, the
$P_i(t)$ must be close enough to 1 for each $i \in a$.



3.1 Reconstruction of the measure v* 

For simplicity and without loss of generality we assume that $T = 1$. The posterior
distribution $\nu^*$ can be reconstructed adopting the random measure approach for infinite
dimensional inverse problems explained in detail by Gzyl and Velasquez [14]. To do
this, we express the calibration constraint (1) as an infinite dimensional linear inverse
problem, writing $w_i(t)$ as

$$w_i(t) = \int_0^1 K(s,t)\, \varpi_i(s)\, ds + d_i \quad \text{for each } i \in a,$$

where $K(s, t)$ is a known continuous, real-valued and bounded kernel function and
$\varpi_i(s) = E_\nu[W_i(s)]$, where $W_i(s)$ is a stochastic process.

Hence, the infinite dimensional inverse problem, which takes the form of a Fredholm
integral equation of the first kind, is

$$\begin{aligned}
E_\nu[\mathcal{K}W] &= E_\nu\left[ \sum_{i \in a} \left\{ \int_0^1 K(s,t) W_i(s)\, ds + d_i \right\} X_i(t) \right] \\
&= \sum_{i \in a} \int_0^1 K(s,t) X_i(t) \varpi_i(s)\, ds + \sum_{i \in a} d_i X_i(t) \\
&= N \mu_X(t), \quad t \in [0,1]. \qquad (2)
\end{aligned}$$

To obtain the functions $\varpi_i^*(s)$ that solve the integral equation $E_\nu[\mathcal{K}W] = N\mu_X(t)$,
the random measure approach adopted considers $\varpi_i(s)$ as the density of a measure
$\varpi_i(s)\, ds$, $i \in a$. Under this setting, we define the random measure
$W_i(a, b] = W_i(b) - W_i(a)$ for $(a, b] \subset [0,1]$ such that $dE_\nu\{W_i(0, s]\} = \varpi_i(s)\, ds$ for each $i \in a$.
The next theorem ensures the existence of the posterior distribution $\nu^*$ used to obtain the
functions $\varpi_i^*(s)$, depending on the assumed prior distribution $\upsilon$.

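Theorem 2. Let $\upsilon$ be a prior positive probability measure, $\lambda = \lambda(t)$ a measure in the
class of continuous measures on $[0,1]^q$, $\mathcal{M}(C[0,1]^q)$, and
$\mathcal{V} = \{\nu \ll \upsilon : Z_\upsilon(\lambda) < +\infty\}$ a nonempty open class, where
$Z_\upsilon(\lambda) = E_\upsilon[\exp\{\langle \lambda, \mathcal{K}W \rangle\}]$, with

$$\langle \lambda, \mathcal{K}W \rangle = \int_0^1 \lambda^\top(dt) \left( \int_0^1 \sum_{i \in a} K(s,t) X_i(t)\, dW_i(s) + \sum_{i \in a} d_i X_i(t) \right). \qquad (3)$$

Then there exists a unique probability measure

$$\nu^* = \arg\max_{\nu \in \mathcal{V}} S(\nu \,\|\, \upsilon),$$

subject to $E_\nu[\mathcal{K}W] = N\mu_X(t)$, which is achieved at

$$d\nu^*/d\upsilon = Z_\upsilon^{-1}(\lambda^*) \exp\{\langle \lambda^*, \mathcal{K}W \rangle\},$$

where $\lambda^*(t)$ minimizes the functional

$$H_\upsilon(\lambda) = \log Z_\upsilon(\lambda) - \langle \lambda, N\mu_X \rangle.$$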

Based on Theorem 2, we will carry out the reconstruction of $\nu^*$, assuming the
centered Gaussian and compound Poisson random measures as prior measures, in order
to estimate the respective functional calibration weights $w_i(t)$, $i \in a$. The estimates
are given by the following two lemmas.

Lemma 1. Let $\upsilon$ be a centered stationary Gaussian measure on $(C([0,1]), \mathcal{B}(C([0,1])))$,
and $\lambda = \lambda(t) \in \mathcal{M}(C[0,1]^q)$. Then $w_i(t) = \int_0^1 K(s,t) \varpi_i^*(s)\, ds + d_i$, $i \in a$, where

$$\varpi_i^*(s) = \sum_{i' \in a} \int_0^1 K(s,t') X_{i'}^\top(t')\, \lambda^*(dt').$$

Lemma 2. Let $W_i(s) = \sum_{k=1}^{N(s)} \xi_{ik}$ be a compound Poisson process, where $N(s)$ is a
homogeneous Poisson process on $[0,1]$ with intensity parameter $\gamma > 0$, and $\xi_{ik}$, $k \geq 1$,
are independent and identically distributed real-valued random variables for each $i \in a$
with distribution $u$ on $\mathbb{R}$ satisfying $u(\{0\}) = 0$, and independent of $N(s)$. Then
$w_i(t) = \int_0^1 K(s,t) \varpi_i^*(s)\, ds + d_i$, $i \in a$, where

$$\varpi_i^*(s) = \gamma \int_{\mathbb{R}} \xi_i \exp\left\{ \xi_i \int_0^1 K(s,t) X_i^\top(t)\, \lambda^*(dt) \right\} u(d\xi_i).$$

4 Approximation of the maximum entropy on the mean 
functional estimator 



To approximate the functional calibration weights and the functional maximum
entropy on the mean estimator for the finite population mean of $Y(t)$ with the
assumed prior measure, an Euler discretization scheme is used. Consider a partition of
$(s, t) \in [0,1]^2$ in $J$ and $L$ equidistant fixed points, $(j-1)/J < s_j \leq j/J$, $j = 1, \ldots, J$, and
$(l-1)/L < t_l \leq l/L$, $l = 1, \ldots, L$, respectively. For the corresponding prior measures,
the approximations for the functions $Z_\upsilon(\lambda)$, $H_\upsilon(\lambda)$ and $\lambda^*(t)$ are based on the respective
results found in the Appendix.
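
The Euler scheme replaces the integral in $w_i(t) = \int_0^1 K(s,t)\varpi_i(s)\,ds + d_i$ by an average over the grid points $s_j$. The sketch below illustrates this under assumptions that are not in the paper: right endpoints $s_j = j/J$, a Gaussian kernel with $\sigma^2 = 0.5$, and a constant density $\varpi \equiv 1$ chosen only because the integral then has a closed form in terms of the error function. The function names are hypothetical.

```python
import math

def kernel(s, t, sigma2=0.5):
    """Gaussian kernel K(s, t) = exp(-(t - s)^2 / (2 sigma^2))."""
    return math.exp(-(t - s) ** 2 / (2.0 * sigma2))

def euler_weight(varpi, d_i, t, J):
    """Approximate w_i(t) by J^{-1} sum_j K(s_j, t) varpi(s_j) + d_i with s_j = j / J."""
    s_grid = [j / J for j in range(1, J + 1)]
    return sum(kernel(s, t) * varpi(s) for s in s_grid) / J + d_i

def exact_weight_constant_density(d_i, t, sigma2=0.5):
    """Closed form of int_0^1 K(s, t) ds + d_i, valid only for varpi(s) = 1."""
    sigma = math.sqrt(sigma2)
    c = sigma * math.sqrt(2.0)
    integral = sigma * math.sqrt(math.pi / 2.0) * (math.erf((1.0 - t) / c) + math.erf(t / c))
    return integral + d_i

approx = euler_weight(lambda s: 1.0, 5.0, 0.3, 200)
exact = exact_weight_constant_density(5.0, 0.3)
euler_error = abs(approx - exact)
```

The discretization error of this one-sided rule decays like $1/J$, which is one reason a moderately fine grid in $s$ is needed.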



4.1 Centered Gaussian measure 



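For a prior centered Gaussian random measure, the approximations of the linear
moment calibration constraint (2) and the inner product $\langle \lambda, \mathcal{K}W \rangle$ are, respectively,
given by

$$E_\nu\left[ \sum_{i \in a} \left\{ \sum_{j=1}^{J} K(s_j, t_l)\, \Delta W_i(s_j) X_i(t_l) + d_i X_i(t_l) \right\} \right] = N\mu_X(t_l)$$

and

$$\begin{aligned}
\langle \lambda, \mathcal{K}W \rangle &\approx \frac{1}{L} \sum_{l=1}^{L} \lambda^\top(t_l) \sum_{j=1}^{J} \sum_{i \in a} K(s_j, t_l)\, \Delta W_i(s_j) X_i(t_l) + \frac{1}{L} \sum_{l=1}^{L} \lambda^\top(t_l) \sum_{i \in a} d_i X_i(t_l) \\
&= \frac{1}{L} \sum_{j=1}^{J} \sum_{i \in a} \sum_{l=1}^{L} K(s_j, t_l)\, \Delta W_i(s_j) \lambda^\top(t_l) X_i(t_l) + \frac{1}{L} \sum_{i \in a} d_i \sum_{l=1}^{L} \lambda^\top(t_l) X_i(t_l),
\end{aligned}$$

where $\Delta W_i(s_j) = W_i(s_j) - W_i(s_{j-1})$ is the discrete version of $dW_i(s)$ for $i \in a$.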

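Therefore, we have that $Z_\upsilon(\lambda)$ is approximated at the grid (see equation (6) of the
proof of Lemma 1 in the Appendix) by

$$\begin{aligned}
Z_\upsilon(\lambda) &\approx E_\upsilon\left[ \exp\left\{ \frac{1}{L} \sum_{i \in a} d_i \sum_{l=1}^{L} \lambda^\top(t_l) X_i(t_l) + \frac{1}{L} \sum_{j=1}^{J} \sum_{i \in a} \sum_{l=1}^{L} K(s_j, t_l) \lambda^\top(t_l) X_i(t_l)\, \Delta W_i(s_j) \right\} \right] \\
&= \exp\left\{ \frac{1}{L} \sum_{i \in a} d_i \sum_{l=1}^{L} \lambda^\top(t_l) X_i(t_l) + \sum_{j=1}^{J} \frac{1}{2J} \left( \frac{1}{L} \sum_{i \in a} \sum_{l=1}^{L} K(s_j, t_l) \lambda^\top(t_l) X_i(t_l) \right)^2 \right\} \\
&= \exp\left\{ \frac{1}{L} \sum_{i \in a} d_i \sum_{l=1}^{L} \lambda^\top(t_l) X_i(t_l) \right\} \prod_{j=1}^{J} \exp\left\{ \frac{1}{2J} \sum_{i \in a} \sum_{i' \in a} h_i(s_j) h_{i'}(s_j) \right\} \\
&= \exp\left\{ \frac{1}{L} \sum_{i \in a} d_i \sum_{l=1}^{L} \lambda^\top(t_l) X_i(t_l) \right\} \prod_{j=1}^{J} z_j(h_i(s_j)),
\end{aligned}$$

where $h_i(s_j) = L^{-1} \sum_{l=1}^{L} K(s_j, t_l) \lambda^\top(t_l) X_i(t_l)$, $i \in a$, $j = 1, \ldots, J$.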

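Now, the finite dimensional maxentropic solution for $\varpi_i(s_j)$ for each $i \in a$ is
approximated by (see Gzyl and Velasquez [14])

$$\begin{aligned}
\varpi_i^*(s_j) &= \left. \frac{\partial \log z_j(h_i(s_j))}{\partial\, J^{-1} h_i(s_j)} \right|_{h_i(s_j) = \mathcal{K}\lambda^*} = \left. \sum_{i' \in a} h_{i'}(s_j) \right|_{h_i(s_j) = \mathcal{K}\lambda^*} \\
&= \frac{1}{L} \sum_{l=1}^{L} \sum_{i' \in a} K(s_j, t_l)\, \lambda^{*\top}(t_l) X_{i'}(t_l), \qquad (4)
\end{aligned}$$

where the finite dimensional version of $\lambda^*(t_l)$, $(l-1)/L < t_l \leq l/L$, $l = 1, \ldots, L$, is the
minimizer of $H_\upsilon(\lambda)$, whose approximation (see equation (7) of the proof of Lemma 1
in the Appendix) is

$$\begin{aligned}
H_\upsilon(\lambda) \approx{} & \frac{1}{2JL^2} \sum_{l=1}^{L} \sum_{l'=1}^{L} \lambda^\top(t_l) \left( \sum_{j=1}^{J} \sum_{i \in a} \sum_{i' \in a} K(s_j, t_l) K(s_j, t_{l'}) X_i(t_l) X_{i'}^\top(t_{l'}) \right) \lambda(t_{l'}) \\
& + \frac{1}{L} \sum_{l=1}^{L} \lambda^\top(t_l) \left( \sum_{i \in a} d_i X_i(t_l) - N\mu_X(t_l) \right).
\end{aligned}$$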

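The first order condition (see equation (8)) associated with this minimization problem is

$$\frac{1}{JL^2} \sum_{j=1}^{J} \sum_{l'=1}^{L} K(s_j, t_l) K(s_j, t_{l'}) \sum_{i \in a} \sum_{i' \in a} X_i(t_l) X_{i'}^\top(t_{l'})\, \lambda^*(t_{l'}) + \frac{1}{L} \left( \sum_{i \in a} d_i X_i(t_l) - N\mu_X(t_l) \right) = 0, \quad l = 1, \ldots, L,$$

whose solution $\lambda^*(t_{l'})$, $l' = 1, \ldots, L$, is given by the linear system

$$\sum_{l'=1}^{L} \left[ \sum_{j=1}^{J} \sum_{i \in a} \sum_{i' \in a} K(s_j, t_l) K(s_j, t_{l'}) X_i(t_l) X_{i'}^\top(t_{l'}) \right] \lambda^*(t_{l'}) = JL \left( N\mu_X(t_l) - \sum_{i \in a} d_i X_i(t_l) \right), \quad l = 1, \ldots, L.$$

Finally, the approximation of the finite dimensional solution of $w_i(t)$ is

$$w_i(t_l) = \frac{1}{J} \sum_{j=1}^{J} K(s_j, t_l)\, \varpi_i^*(s_j) + d_i,$$

where $\varpi_i^*(s_j)$ is given by equation (4).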
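
The Gaussian-prior computation can be sketched end to end on a small grid: solve the discretized first order condition for $\lambda^*$, form $\varpi^*(s_j)$, and recover the weights. This is an illustrative sketch only, under assumptions that are not in the paper: one auxiliary curve ($q = 1$), a toy population, coarse grids ($J = 20$, $L = 4$), and a narrow kernel bandwidth ($\sigma^2 = 0.02$) chosen so that the small linear system stays well conditioned. The helper names `solve_linear` and `gaussian_mem_weights` are hypothetical.

```python
import math
import random

def kernel(s, t, sigma2=0.02):
    # Gaussian kernel; the bandwidth is an illustrative choice
    return math.exp(-(t - s) ** 2 / (2.0 * sigma2))

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gaussian_mem_weights(X_s, d, N_mu_X, s_grid, t_grid):
    """Weights w_i(t_l) for q = 1; X_s[k][l] is the k-th sampled curve at t_l."""
    J, L = len(s_grid), len(t_grid)
    S = [sum(x[l] for x in X_s) for l in range(L)]   # sum_i X_i(t_l)
    G = [[sum(kernel(s, t_grid[l]) * kernel(s, t_grid[m]) for s in s_grid)
          for m in range(L)] for l in range(L)]
    A = [[G[l][m] * S[l] * S[m] for m in range(L)] for l in range(L)]
    b = [J * L * (N_mu_X[l] - sum(di * x[l] for di, x in zip(d, X_s))) for l in range(L)]
    lam = solve_linear(A, b)                          # lambda*(t_l), l = 1..L
    varpi = [sum(kernel(s, t_grid[m]) * lam[m] * S[m] for m in range(L)) / L
             for s in s_grid]                         # discretized equation (4)
    shift = [sum(kernel(s_grid[j], t_grid[l]) * varpi[j] for j in range(J)) / J
             for l in range(L)]
    return [[shift[l] + di for l in range(L)] for di in d]

# Toy data (assumption): X_i(t) = 2 + u_i + sin(2 pi t), u_i uniform on [-1, 1]
random.seed(2)
N, n, J, L = 200, 20, 20, 4
s_grid = [j / J for j in range(1, J + 1)]
t_grid = [l / L for l in range(1, L + 1)]
pop = [[2.0 + u + math.sin(2.0 * math.pi * t) for t in t_grid]
       for u in (random.uniform(-1.0, 1.0) for _ in range(N))]
N_mu_X = [sum(x[l] for x in pop) for l in range(L)]
X_s = [pop[i] for i in random.sample(range(N), n)]
d = [N / n] * n
W = gaussian_mem_weights(X_s, d, N_mu_X, s_grid, t_grid)
max_residual = max(abs(sum(W[k][l] * X_s[k][l] for k in range(n)) - N_mu_X[l])
                   for l in range(L))
```

In this sketch `max_residual` measures the violation of the discretized constraint $\sum_{i \in a} w_i(t_l) X_i(t_l) = N\mu_X(t_l)$. With wide kernels or fine grids in $t$ the system matrix can become nearly singular, in which case a regularized or least-squares solve would be a more prudent choice than plain elimination.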
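
4.2 Compound Poisson measure

Based on equations (9) and (10) of the proof of Lemma 2 in the Appendix, the
approximation of $Z_\upsilon(\lambda)$ is given by

$$\begin{aligned}
Z_\upsilon(\lambda) &\approx E_\upsilon\left[ \exp\left\{ \sum_{i \in a} \langle g_i(s_j), dW_i \rangle + \left\langle \lambda, \sum_{i \in a} d_i X_i(t_l) \right\rangle \right\} \right] \\
&= \exp\left\{ \left\langle \lambda, \sum_{i \in a} d_i X_i(t_l) \right\rangle \right\} E_\upsilon\left[ \exp\left\{ \sum_{i \in a} \langle g_i(s_j), dW_i \rangle \right\} \right] \\
&= \exp\left\{ \left\langle \lambda, \sum_{i \in a} d_i X_i(t_l) \right\rangle \right\} \prod_{i \in a} \prod_{j=1}^{J} \exp\left\{ \frac{\gamma}{J} \int_{\mathbb{R}} \left( \exp\left\{ \frac{1}{L} \sum_{l=1}^{L} K(s_j, t_l) \lambda^\top(t_l) X_i(t_l)\, \xi_i \right\} - 1 \right) u(d\xi_i) \right\} \\
&= \exp\left\{ \left\langle \lambda, \sum_{i \in a} d_i X_i(t_l) \right\rangle \right\} \prod_{i \in a} \prod_{j=1}^{J} \exp\left\{ \frac{\gamma}{J} \int_{\mathbb{R}} \left( \exp\{ h_i(s_j)\, \xi_i \} - 1 \right) u(d\xi_i) \right\} \\
&= \exp\left\{ \left\langle \lambda, \sum_{i \in a} d_i X_i(t_l) \right\rangle \right\} \prod_{i \in a} \prod_{j=1}^{J} z_j(h_i(s_j)),
\end{aligned}$$

where $h_i(s_j) = L^{-1} \sum_{l=1}^{L} K(s_j, t_l) \lambda^\top(t_l) X_i(t_l)$, $i \in a$, $j = 1, \ldots, J$, and
$\langle \lambda, \sum_{i \in a} d_i X_i(t_l) \rangle = L^{-1} \sum_{i \in a} d_i \sum_{l=1}^{L} \lambda^\top(t_l) X_i(t_l)$.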
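
The approximated maxentropic solution for $\varpi_i(s_j)$ for each $i \in a$ is

$$\varpi_i^*(s_j) = \left. \frac{\partial \log z_j(h_i(s_j))}{\partial\, J^{-1} h_i(s_j)} \right|_{h_i(s_j) = \mathcal{K}\lambda^*} = \left. \gamma \int_{\mathbb{R}} \xi_i \exp\{ \xi_i h_i(s_j) \}\, u(d\xi_i) \right|_{h_i(s_j) = \mathcal{K}\lambda^*}, \qquad (5)$$

where the finite dimensional version of $\lambda^*(t_l)$ is the minimizer of $H_\upsilon(\lambda)$, whose
approximation, by equation (11) of the proof of Lemma 2 in the Appendix, is

$$H_\upsilon(\lambda) \approx \frac{\gamma}{J} \sum_{j=1}^{J} \sum_{i \in a} \int_{\mathbb{R}} \left( \exp\{ \xi_i h_i(s_j) \} - 1 \right) u(d\xi_i) + \frac{1}{L} \sum_{l=1}^{L} \lambda^\top(t_l) \left( \sum_{i \in a} d_i X_i(t_l) - N\mu_X(t_l) \right).$$

The corresponding equation for $\lambda^*(t_l)$ that minimizes $H_\upsilon(\lambda)$ is given by the nonlinear
system of equations (see equation (12) in the Appendix)

$$\sum_{i \in a} \left\{ \frac{1}{J} \sum_{j=1}^{J} K(s_j, t_l) \left[ \gamma \int_{\mathbb{R}} \xi_i \exp\left\{ \frac{1}{L} \sum_{l'=1}^{L} \xi_i K(s_j, t_{l'}) X_i^\top(t_{l'}) \lambda^*(t_{l'}) \right\} u(d\xi_i) \right] + d_i \right\} X_i(t_l) = N\mu_X(t_l), \quad l = 1, \ldots, L.$$

Finally, as in the Gaussian measure case, the finite dimensional solution of $w_i(t)$ is
approximated by $w_i(t_l) = J^{-1} \sum_{j=1}^{J} K(s_j, t_l)\, \varpi_i^*(s_j) + d_i$ with $\varpi_i^*(s_j)$ given by
equation (5).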



5 Simulation study 

We illustrate through a simple simulation study the performance of the results
obtained in the above section. Considering a finite population $U_N$ of size $N = 1000$, we
generate a functional random variable $Y_i(t)$ by the point-wise multiple linear regression
model

$$Y_i(t) = \alpha(t) + X_i(t)^\top \beta(t) + \varepsilon_i(t), \quad i \in U_N,$$

where $\alpha(t) = 1.2 + 2.3\cos(2\pi t) + 4.2\sin(2\pi t)$, $\beta(t) = (\beta_1(t), \beta_2(t))^\top$ with
$\beta_1(t) = \cos(10t)$ and $\beta_2(t) = t\sin(15t)$, $X_i(t) = (X_{i1}(t), X_{i2}(t))^\top$, and
$\varepsilon_i(t) \sim \mathcal{N}(0, \sigma_\varepsilon^2(1 + t))$ with $\sigma_\varepsilon^2 = 0.1$, independent of $X_i(t)$. The auxiliary functional
covariates are defined by $X_{i1}(t) = U_{i1} + f_1(t)$ with $f_1(t) = 3\sin(3\pi t + 3)$, and
$X_{i2}(t) = U_{i2} + f_2(t)$ with $f_2(t) = -\cos(\pi t)$, where $U_{i1}$ and $U_{i2}$ are independent and,
respectively, i.i.d. uniform random variables on the intervals $[-1, 1.3]$ and $[-0.5, 0.5]$.
The design time points for $t \in [0,1]$ and $s \in [0,1]$ are $t_j = j/J$, $j = 1, \ldots, J$,
and $s_l = l/L$, $l = 1, \ldots, L$, with $J = 50$ and $L = 80$. Figures 1 and 2
show, respectively, the simulated finite population auxiliary functional covariates and
functional responses for each $i \in U_N$, together with the respective finite population functional
means, $\mu_X(t) = (\mu_{X_1}(t), \mu_{X_2}(t))^\top$ and $\mu_Y(t) = N^{-1} \sum_{i \in U_N} Y_i(t)$. Assuming a
uniform fixed-size sampling design, we draw a sample $a \subset U_N$ of $n = 0.12N$
elements without replacement. For the kernel function we assume a Gaussian one,
$K(t,s) = \exp\{-|t - s|^2/(2\sigma^2)\}$ with $\sigma^2 = 0.5$. The random variables $\xi_i$ for the
compound Poisson case are assumed i.i.d. uniform on the interval $[-1, 1]$, and $\gamma = 1$.
To solve the nonlinear system of equations for $\lambda^*(t_l)$ in the compound Poisson case,
we used the R package BB (see Varadhan [32] and Varadhan and Gilbert [33]).
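
The data-generating step of this design can be sketched as follows. This is a hedged illustration in Python rather than the authors' code (the paper reports using R): the function name `simulate_population`, the seeds, and the choice to draw the errors $\varepsilon_i(t_j)$ independently across grid points are assumptions of this example.

```python
import math
import random

def simulate_population(N=1000, J=50, sigma2_eps=0.1, seed=0):
    """Generate X_i1, X_i2 and Y_i on the grid t_j = j / J for i = 1..N."""
    rng = random.Random(seed)
    t_grid = [j / J for j in range(1, J + 1)]
    X1, X2, Y = [], [], []
    for _ in range(N):
        u1 = rng.uniform(-1.0, 1.3)
        u2 = rng.uniform(-0.5, 0.5)
        x1 = [u1 + 3.0 * math.sin(3.0 * math.pi * t + 3.0) for t in t_grid]
        x2 = [u2 - math.cos(math.pi * t) for t in t_grid]
        y = []
        for k, t in enumerate(t_grid):
            alpha = 1.2 + 2.3 * math.cos(2.0 * math.pi * t) + 4.2 * math.sin(2.0 * math.pi * t)
            beta1 = math.cos(10.0 * t)
            beta2 = t * math.sin(15.0 * t)
            # Error variance sigma_eps^2 * (1 + t); independence across t is an assumption
            eps = rng.gauss(0.0, math.sqrt(sigma2_eps * (1.0 + t)))
            y.append(alpha + x1[k] * beta1 + x2[k] * beta2 + eps)
        X1.append(x1)
        X2.append(x2)
        Y.append(y)
    return t_grid, X1, X2, Y

t_grid, X1, X2, Y = simulate_population()
N = len(Y)
n = round(0.12 * N)                                # sample size n = 0.12 N
sample = random.Random(1).sample(range(N), n)      # uniform design, without replacement
d = [N / n] * n                                    # design weights d_i = 1 / pi_i
ht_mean_Y = [sum(d[k] * Y[i][j] for k, i in enumerate(sample)) / N
             for j in range(len(t_grid))]          # functional Horvitz-Thompson estimate
mean_X1_first = sum(x[0] for x in X1) / N
```

Here `ht_mean_Y` is the functional Horvitz-Thompson estimate on the grid, the baseline against which the calibrated estimators are compared.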




Figure 1: Population auxiliary functional variables (gray), $X_{i1}(t)$ (on the left) and
$X_{i2}(t)$ (on the right). Functional finite population means, $\mu_{X_1}(t)$ and $\mu_{X_2}(t)$ (solid
line)

The graphical comparisons of the estimators for a randomly selected repetition are
illustrated in Figure 2. The figure shows, in general, a good performance, especially
for the estimator assuming the Gaussian measure. The principal differences with
respect to the true functional finite population mean are localized at the edges,
particularly at the left edge. The Horvitz-Thompson estimator, in both cases, shows
a small departure localized around the deep valley, whereas our estimator does not.
A nice feature of the functional calibration method is that it permits one to
check graphically how well the estimator satisfies the calibration constraints for each
covariate, $N^{-1} \sum_{i \in a} w_i(t) X_i(t) = \mu_X(t)$. This is illustrated in Figure 3.

Figure 2: Population survey functions $Y_i(t)$ (in gray), functional finite population mean
$\mu_Y(t)$ (solid line), and the Horvitz-Thompson (dotted line) and functional maximum
entropy on the mean (dashed line) estimators. Panel titles: Gaussian measure, Compound
Poisson measure; horizontal axis: $t$.

Figure 3: Functional calibration constraint (1) for Gaussian (on the left) and compound
Poisson (on the right) measures. $\mu_X(t)$ (solid line), $N^{-1} \sum_{i \in a} w_i(t) X_i(t)$ (dashed line)

Table 1: Bias-variance decomposition of MSE

Functional estimator                      MSE      Bias$^2$   Variance
Horvitz-Thompson                          0.2391   0.0005     0.2386
Maximum entropy on the mean (Gaussian)    0.2001   0.0006     0.1995
Maximum entropy on the mean (Poisson)     0.2333   0.0084     0.2249

To evaluate the performance of the maximum entropic functional calibration
estimator, assuming the Gaussian and compound Poisson prior measures,
we calculated the empirical bias-variance decomposition of its mean square error and
compared it with that of the functional Horvitz-Thompson estimator $\hat{\mu}_Y^{HT}(t)$. The simulation
study was conducted with 100 repetitions. In Table 1 we can see that, with
respect to the Horvitz-Thompson estimator, the maximum entropic estimator has
smaller variance and mean square error for both prior measures, particularly for the
Gaussian prior. Although the Horvitz-Thompson estimator has the smallest squared bias,
the differences are small (at most 0.0079). Also, the small values of the squared bias are
consistent with the approximate unbiasedness of the functional maximum entropy on the
mean and Horvitz-Thompson estimators.
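
The decomposition reported in Table 1 rests on the identity MSE = Bias$^2$ + Variance. A minimal sketch of an empirical version is given below; averaging the pointwise quantities over the time grid and dividing the variance by the number of repetitions are assumptions of this example, since the paper does not state how the functional quantities were aggregated. The function name is hypothetical; the `table_1` values are copied from Table 1.

```python
def bias_variance_decomposition(estimates, truth):
    """Empirical MSE, squared bias and variance of curve estimates, averaged over the grid.

    estimates[r][l] is the estimate at grid point l in repetition r;
    truth[l] is the target mean curve at grid point l.
    """
    R, T = len(estimates), len(truth)
    mean_est = [sum(e[l] for e in estimates) / R for l in range(T)]
    bias2 = sum((mean_est[l] - truth[l]) ** 2 for l in range(T)) / T
    variance = sum(sum((e[l] - mean_est[l]) ** 2 for e in estimates) / R
                   for l in range(T)) / T
    mse = sum(sum((e[l] - truth[l]) ** 2 for e in estimates) / R
              for l in range(T)) / T
    return mse, bias2, variance

# Toy check with two repetitions on a two-point grid
mse, bias2, variance = bias_variance_decomposition([[1.0, 2.0], [3.0, 2.0]], [1.0, 1.0])

# Rows of Table 1 as (MSE, Bias^2, Variance)
table_1 = {
    "Horvitz-Thompson": (0.2391, 0.0005, 0.2386),
    "Maximum entropy on the mean (Gaussian)": (0.2001, 0.0006, 0.1995),
    "Maximum entropy on the mean (Poisson)": (0.2333, 0.0084, 0.2249),
}
table_gaps = {name: abs(m - (b + v)) for name, (m, b, v) in table_1.items()}
```

Each row of Table 1 satisfies the identity to the four reported decimals, so the entries of `table_gaps` are zero up to rounding.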

6 Concluding remarks 

In this paper we have proposed an extension to the problem of obtaining an estimator
for the finite population mean of a survey variable incorporating complete auxiliary
information under an infinite dimensional setting. Considering that both the survey
and the set of auxiliary variables are functions, the respective functional calibration
constraint is expressed as an infinite dimensional linear inverse problem, whose solution
provides the functional survey weights of the calibration estimator. The problem is
solved by means of the maximum entropy on the mean principle, which
is a powerful probabilistic regularization method to solve constrained linear
inverse problems. Here we assume centered Gaussian and compound Poisson random
measures as prior measures to obtain the functional calibration weights. However,
other random measures can also be considered.

The simulation study results show that the proposed functional calibration
estimator is more accurate than the Horvitz-Thompson estimator. In
the simulations, both the functional survey and auxiliary variables were assumed to have
amplitude variations (variation in the y-axis) only. More complex extensions allowing
both amplitude and phase (variation in the x-axis) variations are possible.

Finally, a further interesting extension of the functional calibration estimation
problem under the maximum entropy on the mean approach can be conducted following
the idea of model-calibration proposed by Wu and Sitter [35], Wu [34] and Montanari
and Ranalli [21]. This may be accomplished by considering a nonparametric functional
regression $Y_i(t) = \mu\{X_i(t)\} + \varepsilon_i(t)$, $i \in U_N$, $t \in [0, T]$, to model the relation between
the functional survey variable and the set of functional auxiliary covariates in order to
allow a more effective use of the functional auxiliary information.

Appendix



Proof of Theorem 1. The Lagrangian function associated with the restricted minimization
problem is

$$L_a(w, \lambda) = \mathcal{D}_a^*(w,d) + \lambda(t)^\top \left\{ \mu_X(t) - N^{-1} \sum_{i \in a} w_i(t) X_i(t) \right\},$$

where $\lambda(t)$ is the corresponding functional Lagrange multiplier vector. The first order
conditions are

$$\frac{w_i(t) - d_i}{d_i q_i(t)} - N^{-1} \lambda(t)^\top X_i(t) = 0, \quad i \in a,$$

which can be expressed as

$$w_i(t) = d_i \left[ 1 + N^{-1} q_i(t) \lambda(t)^\top X_i(t) \right], \quad i \in a,$$

where uniqueness is guaranteed by the continuous differentiability of $\mathcal{D}_a^*(w, d)$ with
respect to $w_i(t)$ for all $i \in a$, and by its strict convexity.

From the functional calibration restriction (1) and by the existence assumption on
the inverse of the matrix $\sum_{i \in a} d_i q_i(t) X_i(t) X_i(t)^\top$ for all $t$, the Lagrange functional
multiplier vector is determined by

$$\lambda(t) = N \left\{ N^{-1} \sum_{i \in a} d_i q_i(t) X_i(t) X_i(t)^\top \right\}^{-1} \left\{ \mu_X(t) - \hat{\mu}_X^{HT}(t) \right\}.$$

Finally, substituting $\lambda(t)$ into the first order conditions, the functional calibration
weight $w_i(t)$ of the theorem is obtained. ■



Proof of Theorem 2. Csiszar [5, Theorem 3, page 775]. ■ 

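Proof of Lemma 1. According to Theorem 2, the maximum of the entropy $S(\nu \,\|\, \upsilon)$
over the class $\mathcal{V} = \{\nu \ll \upsilon : Z_\upsilon(\lambda) < \infty\}$ subject to the linear moment calibration
constraint $E_\nu[\mathcal{K}W] = N\mu_X(t)$ is attained at $d\nu^*/d\upsilon = Z_\upsilon^{-1}(\lambda^*) \exp\{\langle \lambda^*, \mathcal{K}W \rangle\}$,
where

$$\begin{aligned}
Z_\upsilon(\lambda) &= \exp\left\{ E_\upsilon[\langle \lambda, \mathcal{K}W \rangle] + \tfrac{1}{2} V_\upsilon[\langle \lambda, \mathcal{K}W \rangle] \right\} \\
&= \exp\left\{ \sum_{i \in a} d_i \int_0^1 \lambda^\top(dt) X_i(t) + \frac{1}{2} \int_0^1 \left[ \sum_{i \in a} \int_0^1 K(s,t) \lambda^\top(dt) X_i(t) \right]^2 ds \right\}, \qquad (6)
\end{aligned}$$

owing to $E_\upsilon[dW_i(s)] = 0$ and $V_\upsilon[dW_i(s)] = ds$, $i \in a$.

Now we proceed with the problem of finding $\lambda^*(dt) \in \mathcal{M}_b(C[0,1]^q)$, where $\mathcal{M}_b$ is
the class of bounded continuous measures, that minimizes

$$\begin{aligned}
H_\upsilon(\lambda) ={} & \frac{1}{2} \sum_{i \in a} \sum_{i' \in a} \int_0^1 \int_0^1 \int_0^1 K(s,t) K(s,t') \lambda^\top(dt) X_i(t) X_{i'}^\top(t') \lambda(dt')\, ds \\
& + \int_0^1 \lambda^\top(dt) \left( \sum_{i \in a} d_i X_i(t) - N\mu_X(t) \right). \qquad (7)
\end{aligned}$$

The corresponding equation for $\lambda^*(dt)$ that minimizes $H_\upsilon(\lambda)$ is given by

$$\sum_{i \in a} \sum_{i' \in a} \int_0^1 \int_0^1 K(s,t) K(s,t') X_i(t) X_{i'}^\top(t') \lambda^*(dt')\, ds + \sum_{i \in a} d_i X_i(t) = N\mu_X(t), \qquad (8)$$

which can be rewritten as

$$\sum_{i \in a} \left\{ \int_0^1 K(s,t) \left[ \sum_{i' \in a} \int_0^1 K(s,t') X_{i'}^\top(t') \lambda^*(dt') \right] ds + d_i \right\} X_i(t) = N\mu_X(t),$$

obtaining, by the moment calibration constraint (2), the Lemma's result. ■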

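Proof of Lemma 2. For each $i \in a$, define a random variable $\mathfrak{m}_i((a, b])$ for
$(a, b] \subset [0,1]$,

$$\mathfrak{m}_i((a, b]) \triangleq W_i(b) - W_i(a) = \sum_{k=N(a)+1}^{N(b)} \xi_{ik}.$$

By the Levy-Khintchine formula for Levy processes, the moment generating
function of the $n$-dimensional compound Poisson process $W(s)$ is given by

$$E_\upsilon[\exp\{\langle \alpha, W(s) \rangle\}] = \exp\left( s\gamma \int_{\mathbb{R}^n} \left( \exp\{\langle \alpha, \xi_k \rangle\} - 1 \right) u(d\xi_k) \right), \quad \alpha \in \mathbb{R}^n,$$

where $\xi_k = (\xi_{1k}, \ldots, \xi_{nk})^\top$. This formula can be generalized for a continuous function
$g(s)$ from $[0,1]$ to $\mathbb{R}$ by defining $\langle g(s), W_i \rangle = \int_0^1 g(s)\, dW_i(s)$ for each $i \in a$, which is
approximated by $\sum_{j=1}^{J} g(s_{j-1})\, \mathfrak{m}_i((s_{j-1}, s_j])$, with $s_j = j/J$, $j = 1, \ldots, J$. Thus, by
the independence of the $\mathfrak{m}_i((a, b])$, we have that

$$\begin{aligned}
E_\upsilon[\exp\{\langle g(s), dW_i \rangle\}] &= \lim_{J \to \infty} \prod_{j=1}^{J} E_\upsilon\left[ \exp\left\{ g(s_{j-1})\, \mathfrak{m}_i((s_{j-1}, s_j]) \right\} \right] \\
&= \lim_{J \to \infty} \prod_{j=1}^{J} \exp\left\{ \frac{\gamma}{J} \int_{\mathbb{R}} \left( \exp\{ g(s_{j-1})\, \xi_i \} - 1 \right) u(d\xi_i) \right\} \\
&= \exp\left\{ \gamma \int_0^1 ds \int_{\mathbb{R}} \left( \exp\{ g(s)\, \xi_i \} - 1 \right) u(d\xi_i) \right\}, \quad i \in a. \qquad (9)
\end{aligned}$$

Now, by Theorem 2, the maximum of the entropy $S(\nu \,\|\, \upsilon)$ over the class $\mathcal{V}$ subject
to $E_\nu[\mathcal{K}W] = N\mu_X(t)$ is achieved at $d\nu^*/d\upsilon = Z_\upsilon^{-1}(\lambda^*) \exp\{\langle \lambda^*, \mathcal{K}W \rangle\}$ with

$$\begin{aligned}
\langle \lambda, \mathcal{K}W \rangle &= \int_0^1 \lambda^\top(dt) \int_0^1 \sum_{i \in a} K(s,t) X_i(t)\, dW_i(s) + \int_0^1 \lambda^\top(dt) \sum_{i \in a} d_i X_i(t) \\
&= \sum_{i \in a} \langle g_i(s), W_i \rangle + \left\langle \lambda, \sum_{i \in a} d_i X_i(t) \right\rangle,
\end{aligned}$$

where $g_i(s) = \int_0^1 \lambda^\top(dt) K(s,t) X_i(t)$. Therefore, by the independence of the $W_i$'s,

$$\begin{aligned}
Z_\upsilon(\lambda) &= \prod_{i \in a} \exp\left\{ \gamma \int_0^1 ds \int_{\mathbb{R}} \left( \exp\{ g_i(s)\, \xi_i \} - 1 \right) u(d\xi_i) \right\} \exp\left\{ \left\langle \lambda, \sum_{i \in a} d_i X_i(t) \right\rangle \right\} \\
&= \exp\left\{ \gamma \sum_{i \in a} \int_0^1 ds \int_{\mathbb{R}} \left( \exp\{ g_i(s)\, \xi_i \} - 1 \right) u(d\xi_i) + \left\langle \lambda, \sum_{i \in a} d_i X_i(t) \right\rangle \right\}. \qquad (10)
\end{aligned}$$

Finally, as in the proof of Lemma 1, the problem reduces to finding $\lambda^*(t)$
that minimizes

$$\begin{aligned}
H_\upsilon(\lambda) ={} & \gamma \sum_{i \in a} \int_0^1 ds \int_{\mathbb{R}} \left( \exp\left\{ \xi_i \int_0^1 \lambda^\top(dt) K(s,t) X_i(t) \right\} - 1 \right) u(d\xi_i) \\
& + \int_0^1 \lambda^\top(dt) \left( \sum_{i \in a} d_i X_i(t) - N\mu_X(t) \right). \qquad (11)
\end{aligned}$$

The corresponding equation for $\lambda^*(dt)$ that minimizes $H_\upsilon(\lambda)$ is given by

$$\sum_{i \in a} \left\{ \int_0^1 K(s,t) \left[ \gamma \int_{\mathbb{R}} \xi_i \exp\left\{ \xi_i \int_0^1 K(s,t') X_i^\top(t') \lambda^*(dt') \right\} u(d\xi_i) \right] ds + d_i \right\} X_i(t) = N\mu_X(t), \qquad (12)$$

obtaining, by the moment calibration constraint (2), the Lemma's result. ■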